Entity Identification in Database Integration
Authors
Abstract
The objective of entity identification is to determine the correspondence between object instances from more than one database. This paper examines the problem at the instance level, assuming that schema-level heterogeneity has been resolved a priori. Soundness and completeness are defined as the desired properties of any entity identification technique. To achieve soundness, a set of identity and distinctness rules are established for entities in the integrated world. We propose the use of an extended key, which is the union of keys (and possibly other attributes) from the relations to be matched, and its corresponding identity rule, to determine the equivalence between tuples from relations which may not share any common key. Instance-level functional dependencies (ILFDs), a form of semantic constraint information about the real-world entities, are used to derive the missing extended key attribute values of a tuple.
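The extended-key idea in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the restaurant-style attributes, the sample tuples, the representation of an ILFD as a (conditions, attribute, value) triple, and the function names `apply_ilfds` and `same_entity`. The sketch treats two tuples as the same entity only when, after ILFDs have filled in missing values, they agree on every extended key attribute.

```python
# Hypothetical setup: relation R is keyed on (name, cuisine), relation S on (name).
# The extended key is the union of those keys.
EXTENDED_KEY = ("name", "cuisine")

# Hypothetical ILFDs: if a tuple satisfies all conditions, the attribute takes the value.
ILFDS = [
    ({"speciality": "hunan"}, "cuisine", "chinese"),
    ({"speciality": "pasta"}, "cuisine", "italian"),
]


def apply_ilfds(tup, ilfds):
    """Return a copy of tup with missing attribute values derived from the ILFDs."""
    result = dict(tup)
    changed = True
    while changed:  # repeat so that a derived value can trigger another ILFD
        changed = False
        for conditions, attr, value in ilfds:
            if result.get(attr) is None and all(
                result.get(a) == v for a, v in conditions.items()
            ):
                result[attr] = value
                changed = True
    return result


def same_entity(t1, t2, extended_key=EXTENDED_KEY, ilfds=ILFDS):
    """Identity rule: match only if both tuples agree on every extended key
    attribute and none of those values is missing."""
    e1 = apply_ilfds(t1, ilfds)
    e2 = apply_ilfds(t2, ilfds)
    return all(
        e1.get(a) is not None and e1.get(a) == e2.get(a) for a in extended_key
    )


r = {"name": "twin dragon", "cuisine": "chinese"}        # tuple from R
s = {"name": "twin dragon", "speciality": "hunan"}       # tuple from S, no cuisine
s_other = {"name": "twin dragon", "speciality": "pasta"}
s_unknown = {"name": "twin dragon", "speciality": "grill"}

print(same_entity(r, s))          # cuisine derived for s via the first ILFD
print(same_entity(r, s_other))    # derived cuisine differs
print(same_entity(r, s_unknown))  # cuisine cannot be derived, so no match is asserted
```

In this sketch a tuple whose extended key cannot be completed is never declared a match, which favors soundness over completeness; whether that mirrors the paper's exact treatment of undetermined pairs is not stated in the abstract.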
Similar resources
Mining Entity-Identification Rules for Database Integration
Entity identification (EI) is the identification and integration of all records which represent the same real-world entity, and is an important task in the database integration process. When a common identification mechanism for similar records across heterogeneous databases is not readily available, EI is performed by examining the relationships between various attribute values among the records. We...
Data Integration Using Data Mining Techniques
Database integration provides integrated access to multiple data sources. Database integration has two main activities: schema integration (forming a global view of the data contents available in the sources) and data integration (transforming source data into a uniform format). This paper focuses on automating the aspect of data integration known as entity identification using data mining tech...
Radio Frequency Identification (RFID): A Technology for Enhancing Computerized Maintenance System (CMMS)
While Computerized Maintenance Management System (CMMS) enables maintenance managers and supervisors to access information about equipment, manpower and maintenance policies, there is still a need to facilitate getting data/information into the backend database where it can be utilized by the organization as information to make decisions regarding the operation of the organization. Si...
Integration of Fuzzy Databases: Problems & Solutions
In this paper, problems in the integration of fuzzy relational databases have been investigated and some solutions have been proposed. In general, database integration consists of two main processes, called schema integration and instance integration, which result in a global schema and a global instance, respectively. Current work assumes a schema integration process to get a global schema from a colle...
Entity identification for heterogeneous database integration--a multiple classifier system approach and empirical evaluation
Entity identification, i.e., detecting semantically corresponding records from heterogeneous data sources, is a critical step in integrating the data sources. The objective of this research is to develop and evaluate a novel multiple classifier system approach that improves entity identification accuracy. We apply various classification techniques drawn from statistical pattern recognition, mac...
Entity Identification in XML Documents
As a natural result of the dissemination of a large variety of XML databases, the well-known problem of data integration must be faced from the XML viewpoint. One of the basic functions of an integration system is record linkage, the task of comparing records to determine those that are differently represented but relate to the same entity. As a consequence of the intrinsically hi...
Publication date: 1993